Genetic and potential antigenic evolution of influenza A(H1N1)pdm09 viruses circulating in Kenya during 2009–2018 influenza seasons

Influenza viruses undergo rapid evolutionary changes, so continuous surveillance is required to monitor genetic and potential antigenic changes in circulating viruses and to guide control and prevention decision making. We sequenced and phylogenetically analyzed A(H1N1)pdm09 virus genome sequences obtained from specimens collected from hospitalized patients of all ages with or without pneumonia between 2009 and 2018 from seven sentinel surveillance sites across Kenya. We compared these sequences with recommended vaccine strains during the study period to infer genetic and potential antigenic changes in circulating viruses and associations with clinical outcome. We generated and analyzed a total of 383 A(H1N1)pdm09 virus genome sequences. Phylogenetic analyses of HA protein revealed that multiple genetic groups (clades, subclades, and subgroups) of A(H1N1)pdm09 virus circulated in Kenya over the study period; these evolved away from their vaccine strain, forming clades 7 and 6, subclades 6C, 6B, and 6B.1, and subgroups 6B.1A and 6B.1A1 through acquisition of additional substitutions. Several amino acid substitutions among circulating viruses were associated with continued evolution of the viruses, especially in antigenic epitopes and receptor binding sites (RBS). Disease severity declined with an increase in age among children aged < 5 years. Our study highlights the necessity of timely genomic surveillance to monitor the evolutionary changes of influenza viruses. Routine influenza surveillance with broad geographic representation and whole genome sequencing capacity to inform on prioritization of antigenic analysis and the severity of circulating strains is critical to improved selection of influenza strains for inclusion in vaccines.

We characterized the genetic and antigenic evolution of A(H1N1)pdm09 viruses circulating in Kenya between 2009 and 2018 using codon-complete gene sequences generated through next-generation sequencing (NGS). We then used the genetic sequence data and available clinical information to investigate associations with clinical outcome among hospitalized children aged < 5 years.

Study design
Samples analyzed in this study were collected between June 2009 and December 2018 through two health facility-based surveillance systems in Kenya, as detailed previously 13. The first involved continuous countrywide surveillance for influenza through severe acute respiratory illness (SARI) sentinel hospital reporting undertaken at six sites supported by the US Centers for Disease Control-Kenya (CDC-Kenya) country office: Kenyatta National Hospital (KNH), Nakuru County and Referral Hospital (CRH), Nyeri CRH, Kakamega CRH, Siaya CRH, and Coast General Teaching and Referral Hospital in Mombasa 7,8,13–16. The second was the pediatric viral pneumonia surveillance undertaken at Kilifi County Hospital (KCH) 12.
In the CDC-Kenya supported surveillance sites, SARI was defined as acute onset of illness (within the last 14 days) among hospitalized patients of all ages with cough and reported fever (feeling feverish) or a recorded temperature of ≥ 38 °C. Furthermore, among hospitalized children aged < 5 years, additional clinical information including difficulty in breathing, lower chest wall indrawing, inability to drink or breastfeed, and nasal flaring was recorded. Chart review was done at the time of discharge or death to collect clinical outcome data. In the second facility-based surveillance, undertaken at KCH from January 2009 through December 2018, a definition of pediatric viral pneumonia among children aged 1 day to 59 months presenting with syndromic severe or very severe pneumonia was used. A history of cough for < 30 days or difficulty breathing, when accompanied by lower chest wall indrawing, was defined as severe pneumonia; a history of cough for < 30 days or difficulty breathing, when accompanied by any one of prostration (including inability to feed or drink), coma, or hypoxemia (oxygen saturation < 90%), was defined as very severe pneumonia 7,16. Demographics, underlying diseases, and signs and symptoms were collected from patients who met the case definition; the patients were also assessed by study clinicians on physical and clinical findings.
We selected a total of 418 (31.9%) A(H1N1)pdm09 virus positive SARI samples for this analysis based on real-time reverse-transcription (RT)-PCR cycle threshold (Ct) of < 35.0, adequate sample volume for RNA extraction (> 140 μL), and balanced distribution of samples across surveillance sites and years. We also identified a total of 157 IAV positive specimens from the KCH surveillance site. These had not previously been subtyped; therefore, we used all of these specimens in the current analysis. Details on sample collection, storage and processing are available in our previous work 13.
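The two per-sample eligibility criteria above (Ct < 35.0 and volume > 140 μL) can be sketched as a simple filter. This is only an illustrative sketch: the `Sample` record, its field names, and the example values are hypothetical, and the text does not specify how the balanced draw across sites and years was made, so the code only groups eligible samples by site and year.

```python
from dataclasses import dataclass

CT_CUTOFF = 35.0        # Ct must be below this value (from the text)
MIN_VOLUME_UL = 140.0   # volume must exceed this value, in microlitres (from the text)

@dataclass
class Sample:
    sample_id: str      # hypothetical identifier
    site: str
    year: int
    ct: float
    volume_ul: float

def is_eligible(s: Sample) -> bool:
    """Apply the two per-sample criteria stated in the text."""
    return s.ct < CT_CUTOFF and s.volume_ul > MIN_VOLUME_UL

def eligible_by_stratum(samples):
    """Group eligible samples by (site, year) as a starting point for a balanced draw."""
    strata = {}
    for s in samples:
        if is_eligible(s):
            strata.setdefault((s.site, s.year), []).append(s)
    return strata

# Made-up records for illustration only
samples = [
    Sample("S1", "KNH", 2009, 24.3, 500.0),
    Sample("S2", "KNH", 2009, 36.1, 500.0),        # fails the Ct criterion
    Sample("S3", "Siaya CRH", 2014, 30.0, 120.0),  # fails the volume criterion
    Sample("S4", "Siaya CRH", 2014, 34.9, 200.0),
]
strata = eligible_by_stratum(samples)
```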

RNA extraction and multi-segment reverse transcription PCR (M-RTPCR) for IAV
We performed viral nucleic acid extraction from IAV and A(H1N1)pdm09 virus positive samples using the QIAamp Viral RNA Mini Kit (Qiagen). We then reverse transcribed the extracted RNA and amplified the complete coding region of the IAV genome in a single M-RTPCR using the Uni/Inf primer set 17. We evaluated successful amplification by running the products on a 2% agarose gel and visualizing them on a UV transilluminator after staining with RedSafe Nucleic Acid Staining solution (iNtRON Biotechnology Inc.).

IAV NGS and virus genome assembly
Following PCR, we purified the amplicons with 1X AMPure XP beads (Beckman Coulter Inc., Brea, CA, USA), quantified amplicons with Quant-iT dsDNA High Sensitivity Assay (Invitrogen, Carlsbad, CA, USA), and normalized amplicons to 0.2 ng/μL. We generated indexed paired-end libraries from 2.5 μL of 0.2 ng/μL amplicon pool using Nextera XT Sample Preparation Kit (Illumina, San Diego, CA, USA) following the manufacturer's protocol. We then purified amplified libraries using 0.8X AMPure XP beads, quantitated libraries using Quant-iT dsDNA High Sensitivity Assay (Invitrogen, Carlsbad, CA, USA), and evaluated libraries for fragment size in the Agilent 2100 BioAnalyzer System using the Agilent High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA, USA). We diluted the libraries to 2 nM in preparation for pooling and denaturation for running on the Illumina MiSeq (Illumina, San Diego, CA, USA). We then denatured the pooled libraries with NaOH, diluted them to 12.5 pM, and sequenced them on the Illumina MiSeq using 2 × 250 bp paired-end reads with the MiSeq v2 500 cycle kit (Illumina, San Diego, CA, USA). We added a five percent Phi-X (Illumina, San Diego, CA, USA) spike-in to the libraries to increase library diversity by creating a more diverse set of library clusters. We assembled contiguous nucleotide sequences (contigs) from the sequence data using the FLU module of the Iterative Refinement Meta-Assembler (IRMA) with default settings. We deposited all the generated sequence data in the National Center for Biotechnology Information (NCBI) GenBank database under the accession numbers OR873656-OR874038, OR874040-OR874805, and OR874852-OR875234.
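As a quick consistency check on the deposited accession ranges, the short sketch below counts how many accessions each inclusive range spans. The helper `accession_count` is hypothetical and assumes that accessions within a range are consecutive and share the two-letter prefix. The ranges sum to 1532 records, which equals 4 × 383. That would be consistent with four records per genome, for example one for each of the four gene segments this report analyzes, although this mapping is an assumption and is not stated in the text.

```python
def accession_count(range_str: str) -> int:
    """Count accessions in an inclusive range such as 'OR873656-OR874038'."""
    start, end = range_str.split("-")
    if start[:2] != end[:2]:
        raise ValueError("range endpoints must share the same prefix")
    return int(end[2:]) - int(start[2:]) + 1

# Ranges copied from the text
ranges = ["OR873656-OR874038", "OR874040-OR874805", "OR874852-OR875234"]
counts = [accession_count(r) for r in ranges]
total = sum(counts)

GENOMES = 383  # number of genome sequences reported in the text
records_per_genome = total / GENOMES

print(counts, total, records_per_genome)
```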

Phylogenetic clustering and genetic group classification
We aligned and translated consensus nucleotide sequences for all gene segments in AliView version 1.26 (https://ormbunkar.se/aliview/). We then reconstructed maximum-likelihood (ML) trees for the individual gene segments using IQ-TREE version 2.0.7 (http://www.iqtree.org/). The software initiates tree reconstruction after assessing and selecting the best-fitting model of nucleotide substitution for the alignment. We linked the ML trees to various metadata and visualized them using R ggtree version 2.4.2 in R programming software v4.0.2 (http://www.rstudio.com/). We used the codon-complete hemagglutinin (HA) sequences of all viruses to characterize A(H1N1)pdm09 virus strains into genetic groups (i.e., clades, subclades, and subgroups) using Phylogenetic Clustering using Linear Integer Programming (PhyCLIP) v2.0 (https://github.com/alvinxhan/PhyCLIP). We downloaded

Predictors of severe infection among hospitalized children aged < 5 years
We used multivariable logistic regression in Stata version 16 (Stata Corp, College Station, Texas, USA) to investigate the predictors of severe infection among hospitalized patients. Only samples collected from hospitalized children aged < 5 years from CDC-Kenya and KCH surveillance were used to estimate the predictors of severe infection. Children hospitalized with fever and acute cough were categorized as severe if they had breathing difficulty and/or lower chest wall indrawing, or otherwise non-severe. The predictors investigated included patient age (categorized as < 12 months, 12-23 months, and ≥ 24 months), location of surveillance sites, year of A(H1N1)pdm09 virus sampling (pandemic period, 2009-2010; post pandemic period, 2011 onwards), A(H1N1)pdm09 virus genetic group, antigenic epitope substitutions, NS1 protein substitutions, and Ct values as proxy for viral load distributed in tertiles.
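The variable coding described above can be sketched as follows. This is not the Stata analysis itself; it only illustrates how the outcome and some categorical predictors might be derived before model fitting. All function names and the example Ct values are hypothetical, and the tertile cut points use the default method of Python's `statistics.quantiles`, which may differ slightly from the cut points Stata would produce.

```python
import statistics

def age_category(age_months: float) -> str:
    """Age bands used as a predictor in the text."""
    if age_months < 12:
        return "<12 months"
    if age_months < 24:
        return "12-23 months"
    return ">=24 months"

def sampling_period(year: int) -> str:
    """Pandemic period is 2009-2010; post-pandemic is 2011 onwards."""
    return "pandemic" if year in (2009, 2010) else "post-pandemic"

def is_severe(breathing_difficulty: bool, chest_indrawing: bool) -> bool:
    """Severe if breathing difficulty and/or lower chest wall indrawing is present."""
    return breathing_difficulty or chest_indrawing

def ct_tertile(ct: float, all_ct) -> int:
    """Return 1, 2 or 3 according to the tertile of the observed Ct distribution."""
    low, high = statistics.quantiles(all_ct, n=3)  # two cut points
    if ct <= low:
        return 1
    if ct <= high:
        return 2
    return 3

# Made-up Ct values for illustration only
example_ct = [20, 22, 24, 26, 28, 30, 32, 34, 36]
print(age_category(7), sampling_period(2010), ct_tertile(28, example_ct))
```

A lower Ct corresponds to more target RNA in the sample, so under this coding tertile 1 would hold the samples with the highest inferred viral load.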

IAV sequencing and genome assembly
We generated and analyzed a total of 383 A(H1N1)pdm09 virus genome sequences. Among 418 A(H1N1)pdm09 virus positive samples from the CDC-Kenya surveillance system, 414 (99.1%) passed pre-sequencing quality control checks, which generated 344 (83.1%) codon-complete A(H1N1)pdm09 virus genome sequences on the MiSeq. Of the 157 IAV positive specimens available from KCH, 94 (59.9%) passed pre-sequencing quality control checks, generating 45 (47.9%) A(H1N1)pdm09 virus (39 codon-complete and 6 partial) and 49 (52.1%) A(H3N2) virus genome sequences (46 codon-complete and 3 partial). For this report, only the 39 codon-complete A(H1N1)pdm09 virus sequences were included in the analyses. The sociodemographic and clinical characteristics of these patients are shown in Table 1.
Comparison of the deduced amino acid sequences of the viruses identified in hospitalized patients relative to the vaccine strains, from which all the 2009-2018 strains had evolved considerably, revealed significant amino acid substitutions in antigenic epitopes and RBS among the circulating A(H1N1)pdm09 virus strains (Table 2). There were 13 amino acid substitutions across the five antigenic sites among sampled Kenyan viruses. Ranking from the most variable site, Sa, Ca2, Sb, Ca1, and Cb had two, six, six, three, and one substitution, respectively. The most frequent HA substitution per site for Sa, Sb, Ca1, Ca2 and Cb was K163Q, S185T, S203T, A141E, and S74R, respectively. HA substitutions S203T and S185T were the most dominant substitutions, occurring in 100% and 59.3% (227/383) of viruses from 2009 to 2018, respectively. In addition, we observed previously described HA substitutions at the RBS (A186T and D222E) and glycosylation sites (A186T and N125S) of the viruses.
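The comparison against a vaccine strain described above amounts to walking an aligned pair of amino acid sequences and reporting each difference in the usual notation (reference residue, position, observed residue). The sketch below is a minimal illustration under stated assumptions, not the authors' pipeline: `call_substitutions` and `percent_carrying` are hypothetical helpers, the two short sequences are made-up fragments rather than real HA sequences, and `first_position` stands in for whatever numbering scheme is applied to the alignment.

```python
def call_substitutions(reference: str, query: str, first_position: int = 1):
    """List substitutions (e.g. 'S11T') between two aligned amino acid sequences."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (ref_aa, qry_aa) in enumerate(zip(reference, query)):
        # Skip identical residues, alignment gaps, and ambiguous residues
        if ref_aa == qry_aa or "-" in (ref_aa, qry_aa) or qry_aa == "X":
            continue
        substitutions.append(f"{ref_aa}{first_position + offset}{qry_aa}")
    return substitutions

def percent_carrying(count: int, total: int) -> float:
    """Percentage of viruses carrying a substitution, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Toy aligned fragments (hypothetical, for illustration only)
reference_fragment = "ASKDN"
query_fragment = "ATKDS"
print(call_substitutions(reference_fragment, query_fragment, first_position=10))

# Frequency reported in the text for S185T: 227 of 383 viruses
print(percent_carrying(227, 383))
```

Applied to real data, the reference would be the aligned HA sequence of the relevant vaccine strain, and each reported position would then be checked against the antigenic site and RBS definitions.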

Discussion
We observed that A(H1N1)pdm09 viruses circulating in Kenya from 2009 to 2018 evolved away from their corresponding vaccine strain over the study period. Kenyan virus strains from 2009 to 2016 evolved away from the 2009 to 2017 NH and 2009 to 2016 SH vaccine strain A/California/07/2009 (H1N1pdm09)-like virus and fell into clades 7 and 6, and subclades 6C, 6B, and 6B.1. All viruses from 2018 evolved away from the 2017-2018 NH and 2017 SH vaccine strain A/Michigan/45/2015 (H1N1pdm09)-like virus and fell into subgroups 6B.1A and 6B.1A1. We identified considerable amino acid substitutions in antigenic epitopes and RBS among the circulating viruses, which confirms the continued evolution of circulating influenza viruses in Kenya. We recently reported that the evolutionary dynamics of A(H1N1)pdm09 virus in Kenya were associated with multiple virus introductions between 2009 and 2018, although only a few of those introductions instigated local seasonal epidemics, which then established local transmission clusters across the country 13. We also observed substitutions in NA protein, which have been associated with reduced susceptibility to neuraminidase inhibitors in vitro 19, substitutions in M2 protein associated with adamantane resistance, and substitutions in NS1 protein that possibly result in increased virulence 20. Nonetheless, we were not able to associate viral genetic changes and substitutions with increased severity.
Analysis of virus sequence data from Kenya during the pandemic in 2009 identified the introduction of clades 2 and 7 viruses into Kenya 21. We recently reported that clades 6 and 7 viruses were introduced into Kenya, disseminated countrywide, and persisted across multiple epidemics as local transmission clusters 13. Another recent study from Kenya reported the circulation of clade 6B, subclade 6B.1, and subclade 6B.2 viruses in the 2015-2018 influenza seasons 22. Here, through detailed genomic analysis, we extend these observations and show that multiple influenza strains were introduced into Kenya and spread countrywide over the study period. Most of the amino acid substitutions associated with the continued evolution of A(H1N1)pdm09 viruses in Kenya have also been reported in other studies in Africa 22–25 and Asia 26,27. Therefore, the continuing local evolution of A(H1N1)pdm09 viruses in Kenya is in part due to the global circulation of influenza viruses.
The genetic diversity of A(H1N1)pdm09 viruses in specific regions, arising from multiple virus introductions and subsequent establishment of local transmission clusters 13 composed of viruses harboring considerable amino acid substitutions in antigenic epitopes and RBS, could lead to predominance of circulating viruses that might escape population immunity elicited by previously circulating viruses or by previously selected vaccine strains, as shown in our study. In countries like Kenya, where influenza virus spread is year-round 9, it is also unpredictable which genetic virus strain may predominate and when. Influenza vaccines that protect against a broad range of antigenically divergent strains ("universal" vaccines) could be key to managing the influenza disease burden in such settings. Currently, Kenya does not have a national influenza vaccination policy 28, but it would be important to consider deployment of influenza vaccines with representative A(H1N1)pdm09 virus, A(H3N2) virus, and influenza B virus for optimal vaccine effectiveness 13,29,30. It will also be important to investigate further whether the use of SH or NH formulated vaccines could have a place in tropical regions such as Kenya, where virus importations from both hemispheres are common.
Specific influenza virus gene segment phylogenies and genetic group memberships have been associated with disease severity 31. Although we did not observe associations between genetic group membership or substitutions and disease severity, these findings underscore the importance of reporting genetic surveillance data along with epidemiological data to allow for analysis of factors that may increase risk of influenza and impact disease severity 32. Although we did not observe an association between viral load and disease severity, larger studies have reported this association among hospitalized patients with pneumonia infection 33, which underscores the need for multi-site studies with improved statistical power to estimate associations. We reported a reduction in disease severity with increasing age among children aged < 5 years, which corroborates the evidence that children aged < 6 months may experience more severe influenza-related complications 34.
Table 2.
Antigenic drift among A(H1N1)pdm09 virus strains collected from Kenya, 2009-2018. NH Northern Hemisphere, SH Southern Hemisphere. † A(H1N1)pdm09 antigenic sites are represented as Sa-a; Sb-b; Ca1-c; Ca2-d; and Cb-e. ‡ Recommended vaccines for each influenza season are adopted from Global Initiative on Sharing All Influenza Data (GISAID) (https://www.gisaid.org/resources/human-influenza-vaccine-composition/).
The study had some limitations. First, the analysis in this report only involved the HA, NA, M2, and NS1 gene segments of A(H1N1)pdm09 virus. Although these regions are important in understanding antigenic drift and antiviral drug resistance in influenza viruses, important changes in other gene segments, for example, mutations associated with increased pathogenicity, may not have been captured. Secondly, the prioritized samples were selected based on anticipated probability of successful sequencing inferred from the sample's viral load as indicated by the diagnosis Ct value. Such a strategy ultimately excluded NGS of some samples that may have been critical in inferring additional genetic characteristics of circulating influenza viruses. Lastly, the analysis in this report did not include phenotypic analyses to assess the effect of observed substitutions on virulence, pathogenicity, and transmissibility of influenza viruses.
In conclusion, our study highlights the necessity of timely genomic surveillance to monitor the evolutionary changes of influenza viruses within a country and the necessity of combined genetic and epidemiological data to improve understanding of influenza season severity and guide intervention. Routine influenza surveillance with broad geographic representation and whole genome sequencing capacity to inform on prioritization of antigenic analysis and the severity of circulating strains is critical to improved selection of influenza strains for inclusion in vaccines.
https://doi.org/10.1038/s41598-023-49157-3 www.nature.com/scientificreports/

Table 1.
Sociodemographic and clinical characteristics of hospitalized patients in severe acute respiratory illness (SARI) and viral pneumonia surveillances in Kenya, 2009-18. ‡ Cough reported within the last 14 days among inpatients of all ages for SARI and < 30 days among children aged 1 day to 59 months for viral pneumonia surveillance. ‡‡ Chest in-drawing, nasal flaring, unable to drink/breastfeed at all, and lethargy in patients aged < 5 years in SARI surveillance. SARI severe acute respiratory illness.

Table 3.
Predictors of severe infection among hospitalized children aged < 5 years. aOR adjusted odds ratio, CI confidence interval, Ct cycle threshold.